Log Mining Using Generalized Association Rules

Author: Mohd. Helmy Abd. Wahab
Publication date: 01/01/2004

The explosive growth in the size and usage of the World Wide Web has made it necessary for Web site administrators to track and analyze the navigation patterns of Web site visitors. To achieve this goal, a web mining tool is necessary. Web mining can be defined as the use of data mining techniques to automatically discover and extract information from web documents. Because data mining is primarily concerned with the discovery of knowledge and aims to answer questions that people do not know how to ask, it is not an automatic process. Rather, one has to exhaustively explore very large volumes of data to determine otherwise hidden relationships. The process extracts high-quality information that can be used to draw conclusions based on relationships or patterns within the data. However, data mining techniques are not easily applicable to Web data, owing to problems related both to the technology underlying the Web and to the lack of standards in the design and implementation of Web pages. Information collected by Web servers and kept in the server log is the main source of data for analyzing user navigation patterns. Once logs have been pre-processed and sessions have been obtained, several kinds of access pattern mining can be performed, depending on the needs of the analyst. Because the method used in this study relies on relatively simple techniques, noise in the data has to be tackled first before the information gathered is adequate for real user profile data. In this study, a data mining technique known as generalized association rules was used to gain insight into website usage patterns. For the purpose of this study, server logs from the Tutor.com portal were retrieved, pre-processed and analyzed. An important finding is that the Mathematics subject is generally popular at the UPSR and PMR levels. In contrast, arts subjects are not popular with Tutor.com users. The system administrator may consider evaluating the content and the links for such subjects, so that the real problem can be identified.
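
The idea of generalized association rules mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the taxonomy, page paths and sessions are invented, and the function names `generalize`, `support` and `confidence` are hypothetical. Each session is extended with the category of every page it contains, and support and confidence are then computed over the extended sessions.

```python
# Hypothetical taxonomy mapping a page to its parent category.
# These paths and categories are assumptions, not data from the study.
TAXONOMY = {
    "/upsr/math": "Mathematics",
    "/pmr/math": "Mathematics",
    "/pmr/art": "Arts",
}


def generalize(session):
    """Return the session's pages plus the taxonomy parent of each page."""
    items = set(session)
    for page in session:
        parent = TAXONOMY.get(page)
        if parent is not None:
            items.add(parent)
    return frozenset(items)


def support(itemset, sessions):
    """Fraction of sessions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= s for s in sessions) / len(sessions)


def confidence(antecedent, consequent, sessions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, sessions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), sessions) / base


# Toy sessions, invented for illustration only.
raw_sessions = [
    ["/upsr/math", "/home"],
    ["/pmr/math", "/home"],
    ["/pmr/art"],
    ["/home"],
]
sessions = [generalize(s) for s in raw_sessions]

math_support = support({"Mathematics"}, sessions)
math_to_home = confidence({"Mathematics"}, {"/home"}, sessions)
```

In this toy data each individual mathematics page appears in only a quarter of the sessions, while the generalized item `Mathematics` appears in half of them, which is the kind of pattern a taxonomy makes visible. A fuller treatment would also prune rules that pair an item with its own ancestor; the sketch does not do that.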

Universiti Utara Malaysia: UUM eTheses

Discovering Web Server Logs Patterns Using Generalized Association Rules Algorithm

Author: Mohamad Farhan Mohamad Mohsin
Mohd Helmy Abd Wahab
Mohd Norzali Haji Mohd
Publication venue: 'IntechOpen'
Publication date: 01/03/2010

IntechOpen

Two-class classification: comparative experiments for chronic kidney disease

Author: Abd Wahab Mohd Helmy
Johari Ahmad Amni
Mustapha Aida
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

More than two million people worldwide currently depend on dialysis treatment or a kidney transplant to survive kidney disease. It is therefore imperative for health agencies such as hospitals or insurance companies to predict the probability that a patient suffers from chronic kidney disease and hence requires medical attention. This study performs a comparative experiment on the prediction of chronic kidney disease via a classification methodology. Two supervised classification algorithms are used to build the classification model: Two-Class Decision Forest and Two-Class Neural Network. Experimental results showed that the Neural Network performed better based on all features, but the Decision Forest produced optimal performance, with high accuracy and precision compared to the Neural Network and to other algorithms from the literature such as K-Nearest Neighbor, Support Vector Machine, and Rule Induction.

UTHM Institutional Repository

Crossref

Elderly care monitoring system with IoT application

Author: Abd Wahab Mohd Helmy
Abdul Jamil Muhammad Mahadi
Ambar Radzi
Bong Jia Cheng
Publication venue: Springer Nature
Publication date: 01/01/2020

Falls among the elderly can have serious consequences, such as injury or even death. It is therefore essential that falls are detected early, and one way to do that is by using an IoT platform. The authors have been developing a wearable device for an elderly monitoring system that uses an accelerometer. The accelerometer data is sent to an Internet-of-Things (IoT) platform called ThingSpeak™. With the IoT platform, elderly patients can be monitored remotely as long as the care providers have good internet access. The paper presents the experimental results of determining the sensitivity and specificity of the accelerometer used in the proposed system. This is the first step toward accurate data acquisition for monitoring purposes. Based on the experimental results, the average sensitivity obtained for this device is 73.3%, while the average specificity is 89.3%. Both tests show promising results, indicating that the device has a fail rate of only 26.7% and an error rate of 10.7%.
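
The fail rate and error rate quoted in this abstract appear to be the complements of the reported sensitivity and specificity. The short sketch below shows that arithmetic, using the standard definitions of the two measures. The function and variable names are hypothetical, and the abstract does not give the underlying test counts, so none are assumed here.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual falls that the device detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of non-fall activities correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


def complement_pct(pct):
    """Complement of a percentage, rounded to one decimal place."""
    return round(100.0 - pct, 1)


# Percentages reported in the abstract.
reported_sensitivity = 73.3
reported_specificity = 89.3

# Complement of sensitivity: share of falls that were missed.
missed_fall_rate = complement_pct(reported_sensitivity)
# Complement of specificity: share of non-falls flagged as falls.
false_alarm_rate = complement_pct(reported_specificity)
```

Under this reading, the 26.7% figure is the share of falls that went undetected and the 10.7% figure is the share of normal activities that were wrongly flagged as falls.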

UTHM Institutional Repository

Crossref

Data pre-processing on web server logs for generalized association rules mining algorithm

Author: Abd Wahab Mohd Helmy
Hanafi Hafizul Fahri
Mohamad Mohsin Mohamad Farhan
Mohd Mohd Norzali
Publication venue: World Academy of Science, Engineering and Technology
Publication date: 01/01/2008

Web log file analysis began as a way for IT administrators to ensure adequate bandwidth and server capacity on their organization's website. Log file data can offer valuable insight into web site usage. It reflects actual usage in natural working conditions, compared to the artificial setting of a usability lab. It represents the activity of many users over a potentially long period of time, compared to a limited number of users for an hour or two each. This paper describes the pre-processing techniques applied to IIS Web server logs, from the raw log file up to the point where the mining process can be performed. Pre-processing is a tedious process, and it depends on the algorithm and the purpose of the application.
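
As a rough illustration of what log pre-processing can involve, the sketch below parses lines in the W3C extended format that IIS can write, drops requests for static resources and failed requests, and groups the remaining page requests into sessions. The abstract does not list the paper's actual steps, so the filter list, the 30-minute session gap, the function names and the sample log lines are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Assumed cleaning rules, not taken from the paper.
STATIC_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")
SESSION_GAP = timedelta(minutes=30)


def parse_w3c(lines):
    """Yield one dict per request, keyed by the names in the #Fields line."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def clean(records):
    """Keep successful page requests and drop static resources."""
    for r in records:
        uri = r["cs-uri-stem"].lower()
        if r["sc-status"] == "200" and not uri.endswith(STATIC_SUFFIXES):
            yield r


def sessionize(records):
    """Group pages per client IP, starting a new session after a long gap."""
    sessions = []
    open_session = {}  # c-ip -> (time of last request, list of pages)
    for r in sorted(records, key=lambda r: (r["date"], r["time"])):
        ts = datetime.strptime(r["date"] + " " + r["time"], "%Y-%m-%d %H:%M:%S")
        last = open_session.get(r["c-ip"])
        if last is None or ts - last[0] > SESSION_GAP:
            pages = []
            sessions.append(pages)
        else:
            pages = last[1]
        pages.append(r["cs-uri-stem"])
        open_session[r["c-ip"]] = (ts, pages)
    return sessions


# Invented sample log, for illustration only.
SAMPLE = """#Version: 1.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status
2008-01-01 10:00:00 10.0.0.1 GET /index.asp 200
2008-01-01 10:00:01 10.0.0.1 GET /logo.gif 200
2008-01-01 10:05:00 10.0.0.1 GET /math.asp 200
2008-01-01 11:00:00 10.0.0.1 GET /index.asp 200
2008-01-01 11:00:05 10.0.0.2 GET /missing.asp 404
""".splitlines()

sessions = sessionize(clean(parse_w3c(SAMPLE)))
```

Identifying users by IP address alone is a simplification; proxies and shared machines would call for extra fields such as the user agent, where the log records them.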

UUM Repository

Comparing the knowledge quality in rough classifier and decision tree classifier

Author: Abd Wahab Mohd Helmy
Mohamad Mohsin Mohamad Farhan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

This paper presents a comparative study of two rule-based classifiers: rough set (Rc) and decision tree (DTc). The two techniques take different approaches to classification but produce the same structure of output, with comparable results. In theory, different classifiers generate different sets of rules as knowledge, even when they are applied to the same classification problem. Hence, the aim of this paper is to investigate the quality of the knowledge produced by Rc and DTc when similar problems are presented to them. Four important performance metrics are used for the comparison: classification accuracy, number of rules, rule length and rule coverage. Five datasets from the UCI Machine Learning repository were chosen and mined using the Rc toolkit ROSETTA, while the C4.5 algorithm in the WEKA application was chosen as the DTc rule generator. The experimental results show that both Rc and DTc are capable of generating quality knowledge, since most of the results are comparable. Rc performs better as an accurate classifier and produces shorter, simpler rules with higher coverage. DTc, meanwhile, generates significantly fewer rules.

UUM Repository

UTHM Institutional Repository

Pattern extraction for programming performance evaluation using directed apriori

Author: Abd Wahab Mohd Helmy
Md Norwawi Norita
Mohamad Mohsin Mohamad Farhan
Zaiyadi Mohd Fairuz
Publication date: 24/06/2009

Computer programming is taught as a core subject in Information Technology related studies. It is one of the most essential skills that each student has to acquire. However, there is still a small number of students who are unable to write a program well. Several studies have indicated that many factors can affect student programming performance. The objective of this paper is therefore to investigate the significant factors that may influence students' programming performance, using information from previous student performance. Because data mining can discover hidden knowledge in databases, a programming dataset comprising the performance profiles of Bachelor of Information Technology students of the Faculty of IT, Universiti Utara Malaysia, in 2004-2005 was explored using data mining techniques. The dataset, which consists of 421 records with 70 attributes of mixed types, was pre-processed and then mined using a directed association rule (AR) mining algorithm, namely Apriori. The results indicated that prior programming experience before starting to learn programming at university and good scores in Mathematics and English in the SPM were among the factors that contribute to good programming grades.

UUM Repository

Discovering usage patterns from web server logs

Author: Abd. Wahab Mohd Helmy
Siraj Fadzilah
Yusoff Nooraini
Publication date: 01/01/2005

As the amount of information available on the World Wide Web (WWW) increases rapidly, the number of sites that hold particular information also increases. To gain some insight into site usage, a system administrator needs tools that can aid in analyzing the site's usage. To achieve this goal, a web mining tool is necessary to discover the usage pattern of a particular site. For the purpose of this study, server logs from an educational portal were retrieved, pre-processed and analyzed. Information collected by the Web servers is kept in the server logs and used as the main source of data for analyzing users’ navigation patterns. Once the server logs have been pre-processed and sessions have been obtained, several kinds of access pattern mining can be performed, depending on the needs of the analyst. In this study, a data mining technique known as Generalized Association Rule was used to gain some insight into website usage patterns. The findings provide an overview of the usage pattern of a particular educational portal. The study also demonstrates how Generalized Association Rule can be used in site usage analysis. Such a technique enables the discovery of hidden information within web server logs.

UUM Repository

The preferable test documentation using IEEE 829

Author: A. Noraziah
Abd Wahab Mohd Helmy Bin
Mohd Sidek Roslina
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2011

During software development, testing is one of the processes used to find errors, and it aims to evaluate whether a program meets its required results. The testing phase involves several testing activities, including the user acceptance test, the test procedure and others. If no documentation is involved in the testing phase, difficulties that arise during testing go unsolved, because there is no reference to consult to overcome the same problem. IEEE 829 is one of the standards that addresses these requirements. The standard provides several documents for use during testing, covering test preparation, test execution and test completion. In this paper, we used this standard as a guideline to analyze which documentation companies prefer the most. According to our analytical study, most companies in Malaysia prepare the Test Plan and Test Summary documents.

UTHM Institutional Repository